text simplification


Policy-based Sentence Simplification: Replacing Parallel Corpora with LLM-as-a-Judge

Wu, Xuanxin, Arase, Yuki, Nagata, Masaaki

arXiv.org Artificial Intelligence

Sentence simplification aims to modify a sentence to make it easier to read and understand while preserving the meaning. Different applications require distinct simplification policies, such as replacing only complex words at the lexical level or rewriting the entire sentence while trading off details for simplicity. However, achieving such policy-driven control remains an open challenge. In this work, we introduce a simple yet powerful approach that leverages Large Language Model-as-a-Judge (LLM-as-a-Judge) to automatically construct policy-aligned training data, completely removing the need for costly human annotation or parallel corpora. Our method enables building simplification systems that adapt to diverse simplification policies.

Sentence simplification could benefit users with reading difficulties, such as second-language (L2) learners and people with reading impairments (e.g., dyslexic individuals), by making text easier to read and understand (Alva-Manchego et al., 2020b). It involves a series of edits, such as lexical paraphrasing, sentence splitting, and removing irrelevant details (Xu et al., 2015). The preferred edit policy, i.e., permissible or appropriate edits in given texts, varies significantly depending on the target audience. In L2 education, one of the major application areas for simplification, previous work in both NLP and language education research has shown that the desired type and degree of simplification edits change depending on learner proficiency and readability levels (Agrawal et al., 2021; Zhong et al., 2020). Specifically, low- to intermediate-level learners benefit from a combination of lexical paraphrasing, structural modifications, and selective deletions to reduce cognitive load. In contrast, advanced learners benefit from lexical paraphrasing, which supports vocabulary acquisition (Chen, 2019), but they gain comparatively less from added cohesion or deletion (Hosoda, 2016; Zhong et al., 2020).
Motivated by these findings, we introduce two distinct edit policies. As illustrated in Table 1, overall-rewriting simplification often combines lexical paraphrasing, structural modifications, and deletions to improve readability for intermediate-level language learners. In contrast, lexical-paraphrasing (Paetzold & Specia, 2016; Li et al., 2025) adheres to the original sentence closely while supporting more efficient vocabulary acquisition for advanced learners.
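The data-construction idea described above can be sketched as a filtering loop: generate candidate simplifications, then keep only those an LLM judge accepts under the chosen policy. The sketch below stubs the judge with a trivial length-based rule so it runs without an API; the real approach would replace `judge_simplification` with an actual LLM call, and the two policy names follow the paper's terminology while the acceptance rules are invented for illustration.

```python
def judge_simplification(source: str, candidate: str, policy: str) -> bool:
    """Stand-in for an LLM-as-a-Judge call (here a toy length heuristic).

    "lexical-paraphrasing" demands the candidate stay close to the source
    length (no deletions or splits); "overall-rewriting" allows any rewrite
    that is no longer than the source.
    """
    src_len, cand_len = len(source.split()), len(candidate.split())
    if policy == "lexical-paraphrasing":
        return abs(src_len - cand_len) <= 2
    if policy == "overall-rewriting":
        return cand_len <= src_len
    raise ValueError(f"unknown policy: {policy}")

def build_training_data(pairs, policy):
    """Keep only (source, candidate) pairs the judge accepts for `policy`."""
    return [(s, c) for s, c in pairs if judge_simplification(s, c, policy)]

candidates = [
    ("The committee ratified the amendment unanimously.",
     "The committee approved the change with full agreement."),
    ("The committee ratified the amendment unanimously.",
     "It was approved."),   # too aggressive for lexical paraphrasing
]
lexical = build_training_data(candidates, "lexical-paraphrasing")
```

With a real judge model, the same loop yields policy-aligned parallel data without any human annotation, which is the paper's central claim.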


Readability Measures and Automatic Text Simplification: In the Search of a Construct

Cardon, Rémi, Doğruöz, A. Seza

arXiv.org Artificial Intelligence

Readability is a key concept in the current era of abundant written information. To help make texts more readable and information more accessible to everyone, a line of research aims at making texts accessible for their target audience: automatic text simplification (ATS). Lately, there have been studies on the correlations between automatic evaluation metrics in ATS and human judgment. However, the correlations between those two aspects and commonly available readability measures (such as readability formulas or linguistic features) have not received as much attention. In this work, we investigate the place of readability measures in ATS by complementing the existing studies on evaluation metrics and human judgment, focusing on English. We first discuss the relationship between ATS and research in readability, then we report a study on correlations between readability measures and human judgment, and between readability measures and ATS evaluation metrics. We identify that, in general, readability measures do not correlate well with automatic metrics and human judgment. We argue that as the three different angles from which simplification can be assessed tend to exhibit rather low correlations with one another, there is a need for a clear definition of the construct in ATS.
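The core computation in such a correlation study is straightforward: a Pearson coefficient between a readability measure and human judgments over the same set of simplifications. The sketch below uses invented numbers purely for illustration; the scale directions (lower readability score = easier, higher human rating = simpler) are assumptions, not the paper's data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

readability_scores = [9.2, 7.5, 11.0, 6.1, 8.4]  # e.g., FKGL-style: lower = easier (assumed)
human_judgments    = [2.0, 3.5, 1.5, 4.0, 3.0]   # 1-5 simplicity ratings (invented)
r = pearson(readability_scores, human_judgments)
```

Because the two scales point in opposite directions, a strong relationship would show up as a large negative `r`; the paper's finding is that such correlations are generally weak.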


Text Simplification with Sentence Embeddings

Shardlow, Matthew

arXiv.org Artificial Intelligence

Sentence embeddings can be decoded to give approximations of the original texts used to create them. We explore this effect in the context of text simplification, demonstrating that reconstructed text embeddings preserve complexity levels. We experiment with a small feed-forward neural network to effectively learn a transformation between sentence embeddings representing high-complexity and low-complexity texts. We provide a comparison to Seq2Seq and LLM-based approaches, showing encouraging results in our much smaller learning setting. Finally, we demonstrate the applicability of our transformation to an unseen simplification dataset (MedEASI), as well as datasets from languages outside the training data (ES, DE). We conclude that learning transformations in sentence embedding space is a promising direction for future research and has the potential to unlock the ability to develop small but powerful models for text simplification and other natural language generation tasks.
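The idea of learning a mapping in embedding space can be illustrated with a toy version: a single linear layer trained by gradient descent to map "complex" embeddings to "simple" ones. Everything here is synthetic (random embeddings, an invented target map, dimension 16); the paper uses a small feed-forward network over real sentence embeddings and decodes the result back to text.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 16, 200                           # embedding dim and pair count (toy values)
true_map = rng.normal(size=(d, d)) / np.sqrt(d)
X = rng.normal(size=(n, d))              # "complex" sentence embeddings (synthetic)
Y = X @ true_map                         # "simple" counterparts (synthetic target)

W = np.zeros((d, d))                     # the linear transformation to learn
lr = 0.1
losses = []
for _ in range(300):
    pred = X @ W
    losses.append(float(np.mean((pred - Y) ** 2)))
    grad = X.T @ (pred - Y) / n          # gradient of the squared error
    W -= lr * grad
```

The training loss drops essentially to zero here because the synthetic target is exactly linear; with real embedding pairs the fit is necessarily approximate, which is why the paper evaluates the decoded text rather than the embedding error alone.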


DETECT: Determining Ease and Textual Clarity of German Text Simplifications

Korobeynikova, Maria, Battisti, Alessia, Fischer, Lukas, Gao, Yingqiang

arXiv.org Artificial Intelligence

Current evaluation of German automatic text simplification (ATS) relies on general-purpose metrics such as SARI, BLEU, and BERTScore, which insufficiently capture simplification quality in terms of simplicity, meaning preservation, and fluency. While specialized metrics like LENS have been developed for English, corresponding efforts for German have lagged behind due to the absence of human-annotated corpora. To close this gap, we introduce DETECT, the first German-specific metric that holistically evaluates ATS quality across all three dimensions of simplicity, meaning preservation, and fluency, and is trained entirely on synthetic large language model (LLM) responses. Our approach adapts the LENS framework to German and extends it with (i) a pipeline for generating synthetic quality scores via LLMs, enabling dataset creation without human annotation, and (ii) an LLM-based refinement step for aligning grading criteria with simplification requirements. To the best of our knowledge, we also construct the largest German human evaluation dataset for text simplification to validate our metric directly. Experimental results show that DETECT achieves substantially higher correlations with human judgments than widely used ATS metrics, with particularly strong gains in meaning preservation and fluency. Beyond ATS, our findings highlight both the potential and the limitations of LLMs for automatic evaluation and provide transferable guidelines for general language accessibility tasks.
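The synthetic-scoring pipeline described above can be sketched as follows: an LLM grader rates each (source, simplification) pair on the three dimensions, and the scored pairs become training data for the learned metric. The grader below is a self-contained heuristic stub, not DETECT's actual prompting scheme; the 1-5 scale and the specific rules are assumptions for illustration only.

```python
def llm_grade(source: str, simplification: str) -> dict:
    """Stand-in for an LLM grader returning 1-5 scores per dimension."""
    shorter = len(simplification) < len(source)
    overlap = len(set(source.lower().split()) & set(simplification.lower().split()))
    return {
        "simplicity": 5 if shorter else 3,   # toy rule: shorter reads as simpler
        "meaning": min(5, 1 + overlap),      # toy rule: word overlap as preservation
        "fluency": 4,                        # fixed placeholder
    }

def build_metric_dataset(pairs):
    """Attach synthetic quality scores to each pair, as training data."""
    return [(s, t, llm_grade(s, t)) for s, t in pairs]

data = build_metric_dataset([
    ("Der Vorstand hat die Maßnahme einstimmig beschlossen.",
     "Der Vorstand hat die Maßnahme beschlossen."),
])
```

Swapping the stub for real LLM responses yields a scored corpus without human annotation, which is the prerequisite DETECT needed to adapt the LENS framework to German.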


Toward Human-Centered Readability Evaluation

İlgen, Bahar, Hattab, Georges

arXiv.org Artificial Intelligence

Text simplification is essential for making public health information accessible to diverse populations, including those with limited health literacy. However, commonly used evaluation metrics in Natural Language Processing (NLP), such as BLEU, FKGL, and SARI, mainly capture surface-level features and fail to account for human-centered qualities like clarity, trustworthiness, tone, cultural relevance, and actionability. This limitation is particularly critical in high-stakes health contexts, where communication must be not only simple but also usable, respectful, and trustworthy. To address this gap, we propose the Human-Centered Readability Score (HCRS), a five-dimensional evaluation framework grounded in Human-Computer Interaction (HCI) and health communication research. HCRS integrates automatic measures with structured human feedback to capture the relational and contextual aspects of readability. We outline the framework, discuss its integration into participatory evaluation workflows, and present a protocol for empirical validation. This work aims to advance the evaluation of health text simplification beyond surface metrics, enabling NLP systems that align more closely with diverse users' needs, expectations, and lived experiences.
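One way to picture a framework like HCRS is as a weighted aggregation over its five dimensions, combining automatic scores with human-feedback scores on a common scale. The formula below is purely illustrative (the paper leaves the exact integration to empirical validation); the dimension names follow the abstract, while the [0, 1] scale and uniform weights are invented.

```python
DIMENSIONS = ("clarity", "trustworthiness", "tone",
              "cultural_relevance", "actionability")

def hcrs(scores: dict, weights: dict) -> float:
    """Weighted mean over the five dimensions, each scored in [0, 1]."""
    total_w = sum(weights[d] for d in DIMENSIONS)
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / total_w

scores = {"clarity": 0.9, "trustworthiness": 0.8, "tone": 0.7,
          "cultural_relevance": 0.6, "actionability": 0.85}
weights = {d: 1.0 for d in DIMENSIONS}   # uniform weights as a neutral default
overall = hcrs(scores, weights)
```

In a participatory workflow, the weights would plausibly be tuned per audience (e.g., weighting actionability higher for public-health instructions), which is exactly the kind of contextual judgment surface metrics cannot express.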


Template-Based Text-to-Image Alignment for Language Accessibility: A Study on Visualizing Text Simplifications

Souayed, Belkiss, Ebling, Sarah, Gao, Yingqiang

arXiv.org Artificial Intelligence

Individuals with intellectual disabilities often have difficulties in comprehending complex texts. While many text-to-image models prioritize aesthetics over accessibility, it is not clear how visual illustrations relate to text simplifications (TS) generated from them. This paper presents a structured vision-language model (VLM) prompting framework for generating accessible images from simplified texts. We designed five prompt templates, i.e., Basic Object Focus, Contextual Scene, Educational Layout, Multi-Level Detail, and Grid Layout, each following distinct spatial arrangements while adhering to accessibility constraints such as object count limits, spatial separation, and content restrictions. Using 400 sentence-level simplifications from four established TS datasets (OneStopEnglish, SimPA, Wikipedia, and ASSET), we conducted a two-phase evaluation: Phase 1 assessed prompt template effectiveness with CLIPScores, and Phase 2 involved human annotation of generated images across ten visual styles by four accessibility experts. Results show that the Basic Object Focus prompt template achieved the highest semantic alignment, indicating that visual minimalism enhances language accessibility. Expert evaluation further identified Retro style as the most accessible and Wikipedia as the most effective data source. Inter-annotator agreement varied across dimensions, with Text Simplicity showing strong reliability and Image Quality proving more subjective. Overall, our framework offers practical guidelines for accessible content generation and underscores the importance of structured prompting in AI-generated visual accessibility tools.
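The Phase-1 ranking amounts to scoring each template by image-text embedding similarity (the cosine underlying CLIPScore) and keeping the best. The sketch below uses random vectors as stand-ins for real CLIP features, and the three template names are taken from the abstract; with actual CLIP embeddings the same `argmax` would reproduce the template comparison.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity, the core of CLIPScore-style alignment scoring."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
text_emb = rng.normal(size=8)            # embedding of the simplified sentence (stub)
templates = {                            # one generated image embedding per template (stubs)
    "Basic Object Focus": rng.normal(size=8),
    "Contextual Scene": rng.normal(size=8),
    "Grid Layout": rng.normal(size=8),
}
scores = {name: cosine(emb, text_emb) for name, emb in templates.items()}
best = max(scores, key=scores.get)
```

Averaging such scores over the 400 simplifications per template is how a winner like Basic Object Focus would be identified in practice.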


2433fec2144ccf5fea1c9c5ebdbc3924-Supplemental-Conference.pdf

Neural Information Processing Systems

A. Proof that the objective in Equation 3 is convex when α is sufficiently small. To validate this statement, we first prove that two factors in the objective are convex (Lemma A.1 and Lemma A.2) and that their combination preserves convexity (Lemma A.3). Lemma A.1. Since P and Q are positive semidefinite, every eigenvalue λ_i, i ∈ [1..N], satisfies λ_i ≥ 0; thus the combination of P and αQ is positive semidefinite. Combining Lemma A.1, Lemma A.2, and Lemma A.3, the objective of Equation 3 is convex when α is sufficiently small. In addition, to avoid replacement clashes, we do not allow any word to appear in more than one word set. Eventually, the top 50 semantically matching pairs are retained for CATER.


Facilitating Cognitive Accessibility with LLMs: A Multi-Task Approach to Easy-to-Read Text Generation

Ledoyen, François, Dias, Gaël, Pantin, Jeremie, Lechervy, Alexis, Maurel, Fabrice, Chahir, Youssef

arXiv.org Artificial Intelligence

Simplifying complex texts is essential for ensuring equitable access to information, especially for individuals with cognitive impairments. The Easy-to-Read (ETR) initiative offers a framework for making content accessible to the neurodivergent population, but the manual creation of such texts remains time-consuming and resource-intensive. In this work, we investigate the potential of large language models (LLMs) to automate the generation of ETR content. To address the scarcity of aligned corpora and the specificity of ETR constraints, we propose a multi-task learning (MTL) approach that trains models jointly on text summarization, text simplification, and ETR generation. We explore two different strategies: multi-task retrieval-augmented generation (RAG) for in-context learning, and MTL-LoRA for parameter-efficient fine-tuning. Our experiments with Mistral-7B and LLaMA-3-8B, based on ETR-fr, a new high-quality dataset, demonstrate the benefits of multi-task setups over single-task baselines across all configurations. Moreover, results show that the RAG-based strategy enables generalization in out-of-domain settings, while MTL-LoRA outperforms all learning strategies within in-domain configurations.
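A common way to set up this kind of multi-task training is to tag each example with its task so a single model sees summarization, simplification, and ETR generation jointly. The prefix format below is an assumption for illustration, not the paper's exact scheme (which explores RAG-based in-context learning and MTL-LoRA rather than plain prefixing).

```python
TASKS = ("summarize", "simplify", "etr")

def make_example(task: str, source: str, target: str) -> dict:
    """Build one training example, marking the task with a textual prefix."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task}")
    return {"input": f"<{task}> {source}", "target": target}

batch = [
    make_example("simplify",
                 "The ordinance was promulgated.",
                 "The law was announced."),
    make_example("etr",
                 "Vaccination schedules vary.",
                 "People get shots at different times."),
]
```

The appeal of the shared setup is that the more plentiful summarization and simplification data can compensate for the scarcity of aligned ETR corpora, which is the motivation the abstract gives for multi-task learning.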


Inclusive Easy-to-Read Generation for Individuals with Cognitive Impairments

Ledoyen, François, Dias, Gaël, Lechervy, Alexis, Pantin, Jeremie, Maurel, Fabrice, Chahir, Youssef, Gouzonnat, Elisa, Berthelot, Mélanie, Moravac, Stanislas, Altinier, Armony, Khairalla, Amy

arXiv.org Artificial Intelligence

Ensuring accessibility for individuals with cognitive impairments is essential for autonomy, self-determination, and full citizenship. However, manual Easy-to-Read (ETR) text adaptations are slow, costly, and difficult to scale, limiting access to crucial information in healthcare, education, and civic life. AI-driven ETR generation offers a scalable solution but faces key challenges, including dataset scarcity, domain adaptation, and balancing lightweight learning of Large Language Models (LLMs). In this paper, we introduce ETR-fr, the first dataset for ETR text generation fully compliant with European ETR guidelines. We implement parameter-efficient fine-tuning on PLMs and LLMs to establish generative baselines. To ensure high-quality and accessible outputs, we introduce an evaluation framework based on automatic metrics supplemented by human assessments. The latter is conducted using a 36-question evaluation form that is aligned with the guidelines. Overall results show that PLMs perform comparably to LLMs and adapt effectively to out-of-domain texts.


Text Adaptation to Plain Language and Easy Read via Automatic Post-Editing Cycles

Calleja, Jesús, Ponce, David, Etchegoyhen, Thierry

arXiv.org Artificial Intelligence

We describe Vicomtech's participation in the CLEARS challenge on text adaptation to Plain Language and Easy Read in Spanish. Our approach features automatic post-editing of different types of initial Large Language Model adaptations, where successive adaptations are generated iteratively until readability and similarity metrics indicate that no further adaptation refinement can be successfully performed. Taking the average of all official metrics, our submissions achieved first and second place in Plain Language and Easy Read adaptation, respectively.
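The post-editing cycle described above is essentially a hill-climbing loop: ask the model for a refined adaptation, keep it if the metrics improve, and stop when they plateau. In the sketch below, `readability` and `post_edit` are toy stand-ins (average word length and a small synonym table), not Vicomtech's actual metrics or models.

```python
def readability(text: str) -> float:
    """Toy score: shorter average word length reads as easier (higher is better)."""
    words = text.split()
    return -sum(len(w) for w in words) / len(words)

def post_edit(text: str) -> str:
    """Toy 'adaptation step': replace one long word with a shorter synonym."""
    replacements = {"utilize": "use", "commence": "start", "terminate": "end"}
    for long_word, short_word in replacements.items():
        if long_word in text:
            return text.replace(long_word, short_word, 1)
    return text

def adapt(text: str, max_rounds: int = 10) -> str:
    """Iterate post-editing until the score stops improving."""
    score = readability(text)
    for _ in range(max_rounds):
        candidate = post_edit(text)
        new_score = readability(candidate)
        if new_score <= score:        # no further improvement: stop
            return text
        text, score = candidate, new_score
    return text

result = adapt("We utilize forms to commence and terminate the process.")
```

In the actual system, the stopping test would additionally track a similarity metric so that refinement halts before meaning preservation degrades, mirroring the readability-versus-similarity trade-off in the abstract.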